Automatic Generation of Context-Based Fill-in-the-Blank Exercises Using Co-occurrence Likelihoods and Google n-grams

نویسندگان

  • Jennifer Hill
  • Rahul Simha
چکیده

In this paper, we propose a method of automatically generating multiple-choice fill-inthe-blank exercises from existing text passages that challenge a reader’s comprehension skills and contextual awareness. We use a unique application of word co-occurrence likelihoods and the Google n-grams corpus to select words with strong contextual links to their surrounding text, and to generate distractors that make sense only in an isolated narrow context and not in the full context of the passage. Results show that our method is successful at generating questions with distractors that are semantically consistent in a narrow context but inconsistent given the full text, with larger n-grams yielding significantly better results.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Distractor Generation for Chinese Fill-in-the-blank Items

This paper reports the first study on automatic generation of distractors for fill-inthe-blank items for learning Chinese vocabulary. We investigate the quality of distractors generated by a number of criteria, including part-of-speech, difficulty level, spelling, word co-occurrence and semantic similarity. Evaluations show that a semantic similarity measure, based on the word2vec model, yields...

متن کامل

Bringing contextual information to google speech recognition

In automatic speech recognition on mobile devices, very often what a user says strongly depends on the particular context he or she is in. The n-grams relevant to the context are often not known in advance. The context can depend on, for example, particular dialog state, options presented to the user, conversation topic, location, etc. Speech recognition of sentences that include these n-grams ...

متن کامل

TrWP: Text Relatedness using Word and Phrase Relatedness

Text is composed of words and phrases. In bag-of-word model, phrases in texts are split into words. This may discard the inner semantics of phrases which in turn may give inconsistent relatedness score between two texts. TrWP , the unsupervised text relatedness approach combines both word and phrase relatedness. The word relatedness is computed using an existing unsupervised co-occurrence based...

متن کامل

Google Web 1T 5-Grams Made Easy (but not for the computer)

This paper introduces Web1T5-Easy, a simple indexing solution that allows interactive searches of the Web 1T 5-gram database and a derived database of quasi-collocations. The latter is validated against co-occurrence data from the BNC and ukWaC on the automatic identification of non-compositional VPC.

متن کامل

Automatic classification of Non-alcoholic fatty liver using texture features from ultrasound images

Background: Accurate and early detection of non-alcoholic fatty liver, which is a major cause of chronic diseases is very important and is vital to prevent the complications associated with this disease. Ultrasound of the liver is the most common and widely performed method of diagnosing fatty liver. However, due to the low quality of ultrasound images, the need for an automatic and intelligent...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016